-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Add support for forward translog reading #20163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds an index-level boolean setting to enable translog forward reading, threads the flag from IndexSettings through Translog into MultiSnapshot, updates MultiSnapshot to traverse snapshots forward when enabled, and adds tests for replication, recovery, and local translog snapshot behavior. Changes
Sequence DiagramsequenceDiagram
participant Config as Index Config
participant Settings as IndexSettings
participant Translog as Translog
participant Multi as MultiSnapshot
participant Files as Translog Files
rect rgb(240,248,255)
Config->>Settings: load index settings
Settings->>Settings: read INDEX_TRANSLOG_READ_FORWARD_SETTING
end
rect rgb(255,250,240)
Translog->>Settings: isTranslogReadForward()
Translog->>Multi: new MultiSnapshot(snapshots, onClose, readForward)
alt readForward == true
Multi->>Multi: startIndex = 0 (forward)
else
Multi->>Multi: startIndex = translogs.length - 1 (backward)
end
end
rect rgb(240,255,240)
loop replay snapshots
alt forward
Multi->>Files: read snapshot at ascending index
Files-->>Multi: operations (forward order)
else backward
Multi->>Files: read snapshot at descending index
Files-->>Multi: operations (reverse order)
end
Multi->>Multi: deduplicate by seqNo and yield operations
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
🔇 Additional comments (4)
Comment |
|
❌ Gradle check result for c968213: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: xuxiong1 <[email protected]>
Signed-off-by: xuxiong1 <[email protected]>
Signed-off-by: xuxiong1 <[email protected]>
c968213 to
234b0c9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/replication/IndexLevelReplicationTests.java (1)
638-704: Consider extracting common test logic.The test duplicates 66 lines from
testSeqNoCollision()(lines 571-636), differing only in the setting on line 646. While explicit duplication in tests aids clarity, you might consider a parameterized test or helper method to reduce maintenance overhead if the test logic evolves.Example approach using a helper method:
public void testSeqNoCollision() throws Exception { testSeqNoCollisionWithSettings(Settings.builder() .put(IndexSettings.INDEX_SOFT_DELETES_SETTING.getKey(), true) .put(IndexSettings.INDEX_TRANSLOG_RETENTION_AGE_SETTING.getKey(), "-1") .put(IndexSettings.INDEX_TRANSLOG_RETENTION_SIZE_SETTING.getKey(), "-1") .build()); } public void testSeqNoCollisionWithReadForward() throws Exception { testSeqNoCollisionWithSettings(Settings.builder() .put(IndexSettings.INDEX_SOFT_DELETES_SETTING.getKey(), true) .put(IndexSettings.INDEX_TRANSLOG_RETENTION_AGE_SETTING.getKey(), "-1") .put(IndexSettings.INDEX_TRANSLOG_RETENTION_SIZE_SETTING.getKey(), "-1") .put(IndexSettings.INDEX_TRANSLOG_READ_FORWARD_SETTING.getKey(), true) .build()); } private void testSeqNoCollisionWithSettings(Settings settings) throws Exception { // Common test logic here }server/src/test/java/org/opensearch/index/translog/LocalTranslogTests.java (1)
3876-3934: Forward-read snapshot test wiring looks correctThe test correctly:
- Constructs a translog with
INDEX_TRANSLOG_READ_FORWARD_SETTINGenabled viagetTranslogConfig(tempDir, settings).- Uses a separate
LocalTransloginstance so it doesn’t interfere with the class-leveltranslog.- Populates
viewsin generation order and asserts thatnewSnapshot()yields the concatenated operations in forward (oldest→newest) order.This aligns with the new forward-reading
MultiSnapshotsemantics and gives good coverage for multi-generation snapshots.As a minor optional clean-up, you could factor the common setup logic between this test and
testSnapshotReadOperationInReverseinto a small helper to reduce duplication, but it’s not strictly necessary here.server/src/test/java/org/opensearch/indices/recovery/RecoveryTests.java (1)
533-572: Read-forward recovery test mirrors baseline behavior appropriatelyThis test cleanly mirrors
testRecoveryTrimsLocalTranslogwhile:
- Enabling
INDEX_TRANSLOG_READ_FORWARD_SETTINGon the index.- Creating the replication group with an
InternalEngineFactory, keeping the execution model aligned with the baseline test.- Reusing the same flow (in-flight docs, replica promotion/demotion, recovery, and consistency assertions), which is exactly what we need to validate trimming semantics under forward translog reading.
The coverage looks solid and should catch regressions specific to read-forward mode.
If you want to reduce maintenance overhead later, you could extract the common body of this test and
testRecoveryTrimsLocalTransloginto a shared helper that takes theSettings(or aboolean readForward) as a parameter, but this duplication is acceptable as-is.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/index/IndexSettings.java(4 hunks)server/src/main/java/org/opensearch/index/translog/MultiSnapshot.java(2 hunks)server/src/main/java/org/opensearch/index/translog/Translog.java(1 hunks)server/src/test/java/org/opensearch/index/replication/IndexLevelReplicationTests.java(1 hunks)server/src/test/java/org/opensearch/index/translog/LocalTranslogTests.java(1 hunks)server/src/test/java/org/opensearch/indices/recovery/RecoveryTests.java(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: Analyze (java)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
🔇 Additional comments (4)
server/src/main/java/org/opensearch/index/translog/Translog.java (1)
764-765: LGTM!The implementation correctly reads the new
isTranslogReadForward()setting from IndexSettings and passes it to the MultiSnapshot constructor, enabling bidirectional translog reading based on index configuration.server/src/main/java/org/opensearch/index/IndexSettings.java (1)
914-914: Translog read‑forward flag wiring looks correct; confirm non‑dynamic lifecycle is intentionalThe new
translogReadForwardflag is:
- Backed by
INDEX_TRANSLOG_READ_FORWARD_SETTINGwith defaultfalse.- Read once at construction time from
settings.- Stored in a
finalfield and exposed viaisTranslogReadForward().This means the read direction is effectively fixed for the lifetime of the
IndexSettingsinstance and cannot be changed via a dynamic settings update (the setting also lacksProperty.Dynamicand there is no update consumer).If the intent is “configure at index creation (or close/open) only”, this is perfectly fine and keeps behavior simple. If you expect operators to toggle read‑forward on an already‑open index, you’d need to:
- Mark the setting as
Property.Dynamic.- Store it in a
volatilefield instead offinal.- Register an update consumer on
scopedSettingssimilar to other translog settings.Given the sensitivity of recovery semantics, the non‑dynamic approach is probably safer, but it’s worth double‑checking that this matches your operational expectations.
Also applies to: 1135-1135, 2102-2107
server/src/main/java/org/opensearch/index/translog/MultiSnapshot.java (2)
57-70: Constructor wiring forreadForwardflag and index initialization looks soundThe added
readForwardflag and constructor wiring are consistent:
readForwardisfinal, so direction is immutable per snapshot.index = readForward ? 0 : translogs.length - 1;correctly handles both directions and the empty‑array case (0vs-1with loops guarding against out‑of‑bounds).No correctness issues here.
84-111: Forward vs backward traversal shares correct dedupe semantics; relies on trim preconditionThe new
next()implementation cleanly bifurcates behavior:
- Forward path (
readForward == true): iteratesindexfrom0totranslogs.length - 1, consuming eachTranslogSnapshotin order and usingseenSeqNo.getAndSet+overriddenOperationsin the same way as before.- Backward path keeps the original behavior (from
translogs.length - 1down to0) with identical per‑operation logic.A few points worth noting:
- The reuse of the same inner loop and
SeqNoSetlogic in both branches preserves the existing semantics of “first occurrence wins with respect to the chosen traversal order” and keepsoverriddenOperationsaccounting correct.- With backward reading, “first occurrence” means “latest generation wins”, which avoids stale operations from older primary terms.
- With forward reading, “first occurrence” means “oldest generation wins”. This matches the risk described in the new index setting Javadoc: if trimming of stale operations (
trimOperationOfPreviousPrimaryTerms(...)) hasn’t happened yet, forward traversal can surface outdated ops from older primary terms.Given that:
- As long as forward reading is only enabled in flows where stale‑term trimming is guaranteed to have run before constructing this
MultiSnapshot, this implementation is consistent with the documented behavior.- If you want extra safety, you could consider adding assertions or tighter invariants at the call site (e.g., around when forward snapshots are created) to ensure we don’t accidentally use forward traversal in a pre‑trim state, but that’s optional and outside this class.
Overall, the bidirectional iteration logic here is correct and symmetric.
Signed-off-by: xuxiong1 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/index/IndexSettings.java(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/IndexSettings.java (3)
915-915: LGTM! Field declaration is correct.The field is appropriately declared as
finalsince the setting is not dynamic.
1136-1136: LGTM! Field initialization is correct.The initialization follows the standard pattern for non-dynamic IndexScope settings.
2103-2108: LGTM! Getter method is correctly implemented.The method follows standard naming conventions and provides appropriate access to the setting value.
Signed-off-by: xuxiong1 <[email protected]>
|
❌ Gradle check result for fe0811d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
This commit introduces a feature to read translog operations in forward order
(oldest to newest) instead of the default backward order (newest to oldest).
Changes:
index.translog.read_forwardsetting (default: false) inIndexSettingsTests:
Related Issues
Resolves #20094
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
New Features
Tests
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.